Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TT-10070] Fix/sanitize error logging #6817

Merged
merged 12 commits into from
Jan 9, 2025
Merged

Conversation

titpetric
Copy link
Contributor

@titpetric titpetric commented Jan 3, 2025

User description

https://tyktech.atlassian.net/browse/TT-10070


PR Type

Bug fix, Enhancement


Description

  • Removed unnecessary PID file reading and improved PID handling.

  • Enhanced Redis connection handling with explicit connect calls.

  • Improved webhook validation by checking for empty target paths.

  • Added test adjustments and cleanup for PID-related functionality.


Changes walkthrough 📝

Relevant files
Enhancement
config.go
Add default PID file location configuration                           

config/config.go

  • Added a default PIDFileLocation configuration.
+1/-0     
server.go
Improve Redis connections and PID handling                             

gateway/server.go

  • Added explicit Redis connection calls across multiple instances.
  • Improved PID handling by removing unnecessary file reads.
  • Enhanced storage connection initialization with timeout handling.
  • Refactored PID-related logic for clarity and reliability.
  • +41/-26 
    Bug fix
    api_loader.go
    Skip mTLS warning when running tests                                         

    gateway/api_loader.go

    • Added a check to skip mTLS warning during tests.
    +2/-1     
    event_handler_webhooks.go
    Improve webhook validation and URL checks                               

    gateway/event_handler_webhooks.go

  • Added validation for empty webhook target paths.
  • Ensured URLs are not empty before parsing.
  • +4/-1     
    Tests
    server_test.go
    Update tests for PID handling changes                                       

    gateway/server_test.go

  • Removed references to readPIDFromFile in tests.
  • Adjusted test setup to align with new PID handling logic.
  • +6/-6     

    💡 PR-Agent usage: Comment /help "your question" on any pull request to receive relevant information

    @buger
    Copy link
    Member

    buger commented Jan 3, 2025

    💔 The detected issue is not in one of the allowed statuses 💔

    Detected Status In Refinement
    Allowed Statuses In Dev,In Code Review,Ready for Testing,In Test,In Progress,In Review ✔️

    Please ensure your jira story is in one of the allowed statuses

    Copy link
    Contributor

    github-actions bot commented Jan 3, 2025

    API Changes

    --- prev.txt	2025-01-09 22:42:56.411418928 +0000
    +++ current.txt	2025-01-09 22:42:51.933418463 +0000
    @@ -4722,6 +4722,7 @@
     			Enabled:     false,
     			AllowUnsafe: []string{},
     		},
    +		PIDFileLocation: "/var/run/tyk/tyk-gateway.pid",
     	}
     )
     var Global func() Config
    @@ -8542,7 +8543,7 @@
     	RPCListener          RPCStorageHandler
     	DashService          DashboardServiceSender
     	CertificateManager   certs.CertificateManager
    -	GlobalHostChecker    HostCheckerManager
    +	GlobalHostChecker    *HostCheckerManager
     	ConnectionWatcher    *httputil.ConnectionWatcher
     	HostCheckTicker      chan struct{}
     	HostCheckerClient    *http.Client

    Copy link
    Contributor

    github-actions bot commented Jan 3, 2025

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Possible Security Warning

    The new condition in the loadApps function checks for mTLS configuration but does not enforce it. This could lead to inconsistent security measures if not properly configured. Ensure this logic is robust and tested.

    if !gw.isRunningTests() && gw.allApisAreMTLS() && !gw.GetConfig().Security.ControlAPIUseMutualTLS && !controlApiIsConfigured {
    	mainLog.Warning("All APIs are protected with mTLS, except for the control API. " +
    		"We recommend configuring the control API port or control hostname to ensure consistent security measures")
    Redis Connection Handling

    Multiple explicit Connect calls to Redis are added. Ensure these connections are properly managed, closed when no longer needed, and do not introduce resource leaks or performance issues.

    	healthCheckStore.Connect()
    
    	gw.InitHostCheckManager(gw.ctx, &healthCheckStore)
    }
    
    gw.initHealthCheck(gw.ctx)
    
    redisStore := storage.RedisCluster{KeyPrefix: "apikey-", HashKeys: gwConfig.HashKeys, ConnectionHandler: gw.StorageConnectionHandler}
    redisStore.Connect()
    
    gw.GlobalSessionManager.Init(&redisStore)
    
    versionStore := storage.RedisCluster{KeyPrefix: "version-check-", ConnectionHandler: gw.StorageConnectionHandler}
    versionStore.Connect()
    
    err := versionStore.SetKey("gateway", VERSION, 0)
    
    if err != nil {
    	mainLog.WithError(err).Error("Could not set version in versionStore")
    }
    
    if gwConfig.EnableAnalytics && gw.Analytics.Store == nil {
    	Conf := gwConfig
    	Conf.LoadIgnoredIPs()
    	gw.SetConfig(Conf)
    	mainLog.Debug("Setting up analytics DB connection")
    
    	analyticsStore := storage.RedisCluster{KeyPrefix: "analytics-", IsAnalytics: true, ConnectionHandler: gw.StorageConnectionHandler}
    	analyticsStore.Connect()
    
    	gw.Analytics.Store = &analyticsStore
    	gw.Analytics.Init()
    
    	store := storage.RedisCluster{KeyPrefix: "analytics-", IsAnalytics: true, ConnectionHandler: gw.StorageConnectionHandler}
    	store.Connect()
    
    	redisPurger := RedisPurger{Store: &store, Gw: gw}
    	go redisPurger.PurgeLoop(gw.ctx)
    
    	if gw.GetConfig().AnalyticsConfig.Type == "rpc" {
    		if gw.GetConfig().AnalyticsConfig.SerializerType == serializer.PROTOBUF_SERIALIZER {
    			mainLog.Error("Protobuf analytics serialization is not supported with rpc analytics.")
    		} else {
    			mainLog.Debug("Using RPC cache purge")
    
    			store := storage.RedisCluster{KeyPrefix: "analytics-", IsAnalytics: true, ConnectionHandler: gw.StorageConnectionHandler}
    			store.Connect()
    
    			purger := rpc.Purger{
    				Store: &store,
    			}
    			purger.Connect()
    			go purger.PurgeLoop(gw.ctx, time.Duration(gw.GetConfig().AnalyticsConfig.PurgeInterval))
    		}
    
    	}
    	go gw.flushNetworkAnalytics(gw.ctx)
    }
    
    // Load all the files that have the "error" prefix.
    templatesDir := filepath.Join(gwConfig.TemplatePath, "error*")
    gw.templates = htmltemplate.Must(htmltemplate.ParseGlob(templatesDir))
    gw.templatesRaw = texttemplate.Must(texttemplate.ParseGlob(templatesDir))
    gw.CoProcessInit()
    
    // Get the notifier ready
    mainLog.Debug("Notifier will not work in hybrid mode")
    mainNotifierStore := &storage.RedisCluster{ConnectionHandler: gw.StorageConnectionHandler}
    mainNotifierStore.Connect()
    gw.MainNotifier = RedisNotifier{mainNotifierStore, RedisPubSubChannel, gw}
    
    if gwConfig.Monitor.EnableTriggerMonitors {
    	h := &WebHookHandler{Gw: gw}
    	if err := h.Init(gwConfig.Monitor.Config); err != nil {
    		mainLog.Error("Failed to initialise monitor! ", err)
    	} else {
    		gw.MonitoringHandler = h
    	}
    }
    
    if conf := gw.GetConfig(); conf.AnalyticsConfig.NormaliseUrls.Enabled {
    	mainLog.Info("Setting up analytics normaliser")
    	conf.AnalyticsConfig.NormaliseUrls.CompiledPatternSet = gw.initNormalisationPatterns()
    	gw.SetConfig(conf)
    }
    
    certificateSecret := gw.GetConfig().Secret
    if gw.GetConfig().Security.PrivateCertificateEncodingSecret != "" {
    	certificateSecret = gw.GetConfig().Security.PrivateCertificateEncodingSecret
    }
    
    storeCert := &storage.RedisCluster{KeyPrefix: "cert-", HashKeys: false, ConnectionHandler: gw.StorageConnectionHandler}
    storeCert.Connect()
    gw.CertificateManager = certs.NewCertificateManager(storeCert, certificateSecret, log, !gw.GetConfig().Cloud)
    Webhook Validation Enhancement

    The added validation for empty TargetPath in webhooks is a good improvement. However, ensure this does not inadvertently block valid configurations or introduce regressions.

    if w.conf.Disabled || w.conf.TargetPath == "" {
    	log.WithFields(logrus.Fields{
    		"prefix": "webhooks",
    	}).Infof("skipping disabled webhook %s", w.conf.Name)
    	return ErrEventHandlerDisabled

    Copy link
    Contributor

    github-actions bot commented Jan 3, 2025

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Score
    Possible issue
    Add error handling for Connect method calls to ensure robustness

    Handle potential errors from the Connect method calls to ensure proper error logging
    and recovery mechanisms.

    gateway/server.go [351]

    -healthCheckStore.Connect()
    +if err := healthCheckStore.Connect(); err != nil {
    +    mainLog.WithError(err).Error("Failed to connect healthCheckStore")
    +}
    Suggestion importance[1-10]: 9

    Why: Adding error handling for the Connect method improves robustness by ensuring proper error logging and recovery mechanisms. This is a significant enhancement to the reliability of the system.

    9
    General
    Make timeout values configurable to improve adaptability and maintainability

    Ensure the timeout value in context.WithTimeout is configurable to avoid hardcoding
    and allow flexibility in different environments.

    gateway/server.go [1281]

    -timeout, cancel := context.WithTimeout(context.Background(), time.Second)
    +timeout, cancel := context.WithTimeout(context.Background(), gwConfig.Storage.ConnectionTimeout)
    Suggestion importance[1-10]: 8

    Why: Making the timeout value configurable improves adaptability and maintainability, especially in environments with varying requirements. This is a valuable improvement, though the suggestion assumes the existence of gwConfig.Storage.ConnectionTimeout without verifying its presence in the PR.

    8
    Security
    Validate TargetPath to ensure it is a secure and valid path

    Add a check to ensure w.conf.TargetPath is a valid and secure path before proceeding
    to avoid potential misconfigurations or vulnerabilities.

    gateway/event_handler_webhooks.go [65]

    -if w.conf.Disabled || w.conf.TargetPath == "" {
    +if w.conf.Disabled || w.conf.TargetPath == "" || !isValidPath(w.conf.TargetPath) {
    Suggestion importance[1-10]: 7

    Why: Adding a validation check for TargetPath enhances security by ensuring it is a valid and secure path. This is a meaningful improvement to avoid potential misconfigurations or vulnerabilities. However, the implementation of isValidPath is not provided, which slightly reduces the score.

    7

    Copy link
    Contributor

    @jeffy-mathew jeffy-mathew left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    shall we have some sort of constructor/New method to get a new storage so that Connect is always called?

    gateway/server_test.go Outdated Show resolved Hide resolved
    gateway/server_test.go Outdated Show resolved Hide resolved
    @titpetric titpetric force-pushed the fix/sanitize-error-logging branch from 77614a4 to a00d748 Compare January 7, 2025 23:26
    Copy link
    Contributor

    @jeffy-mathew jeffy-mathew left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    lgtm, some nit changes left.
    leaving an approval.

    @titpetric titpetric force-pushed the fix/sanitize-error-logging branch from c1db7da to b8dca4e Compare January 9, 2025 22:42
    @titpetric titpetric enabled auto-merge (squash) January 9, 2025 22:42
    @titpetric
    Copy link
    Contributor Author

    shall we have some sort of constructor/New method to get a new storage so that Connect is always called?

    Yes, but need to cover it with docs first. It has about 50 coupling points so that would make this large cleanup much larger :) already feels like it could be two.

    Considering adding a semgrep rule to validate Connect is called as a patch until then. The naming of RedisCluster is also off and we're pushing into storage design pattern issues.

    Copy link

    sonarqubecloud bot commented Jan 9, 2025

    Quality Gate Failed Quality Gate failed

    Failed conditions
    1 Security Hotspot

    See analysis details on SonarQube Cloud

    @titpetric titpetric merged commit f8ca040 into master Jan 9, 2025
    29 of 41 checks passed
    @titpetric titpetric deleted the fix/sanitize-error-logging branch January 9, 2025 23:03
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    3 participants